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1 Introduction 

1.1 Background 

Implementing geometric algorithms can be a difficult task. It has been found out and repeatedly 
rediscovered that there is a huge gap between geometric algorithms as they are described in most 
theoretical papers and their implementation in software. Two issues that are often ignored in the 
theoretical approach turn out to be critical in practice: Degeneracies and numerical precision. These 
issues are collectively referred to as "robustness" and they have been the topic of extensive research. 
Surveys on the topic can be found in [24], [27], [36], also several brief state-of-the-art summaries on 
the topic are collected in [25]. 

In theory degeneracies are often handled by assuming general position, namely assuming that 
degeneracies do not occur. The general position assumption had contributed significantly to the 
advancement of geometric algorithms by letting the researchers focus on the key (theoretical) prob- 
lems while ignoring many technical issues. When implementing a geometric algorithm however, 
degeneracies must be taken into consideration. 

The numerical precision problem was solved in the theory of geometric algorithms by assuming 
infinite precision real arithmetic [28] . For certain algorithms and geometric objects this assumption 
is realizable in practice by using exact arithmetic [1], [3], [4], [6], [15], [31], [37]. Computing with exact 
arithmetic is in general more costly than using floating point arithmetic, and in certain cases not 
realizable because of the geometric primitives that need to be manipulated. Here again there is 
a gap between what could in theory be handled by exact arithmetic and what current technology 
offers [14]. 

The software package that we describe in this paper computes the boundary of a union of spheres, 
the surface area of the boundary, and the intersection pattern of any sphere with all the other spheres 
in a given set. We therefore have to compute the intersection of pairs and triples of spheres, various 
tangency points of great circles and little circles 1 on a sphere, and so on. Such computations are 
not straightforward to carry out using exact arithmetic (all these operations require the solution of 
polynomial equations, so theoretically symbolic schemes could possibly be used here). Also floating 
point arithmetic has the obvious advantages of availability and efficiency. Our goal here is to devise 
robust algorithms that deal with intersecting spheres in H 3 while using floating point arithmetic. 
Some examples of previous and related work on robust floating point geometric algorithms can be 
found in [17], [18], [22], [27], [32], and [33]. 

Our motivating application is geometric modeling of molecules. Our software package is part of 
a toolbox aimed to support the chemist in the drug design process [12], [13]. The basic geometric 
model of a molecule that we use is the so-called hard sphere model where every atom is represented 
by a sphere at some fixed position relative to the other atom spheres in the molecule. Since the hard 
sphere model is an approximate model to begin with, we have the freedom to perturb the spheres 
slightly without much effect on the relevance of the model. 

When computing with floating point arithmetic we cannot precisely determine degeneracies; we 
can only know that we are in a potentially degenerate situation. For example, we may not be able 
to determine for sure that four spheres in our collection meet at a single point, but we could detect 
that the point of intersection of three out of the four spheres is less than some e > away from a 
point of intersection of a different triple of these spheres. For a small parameter e > which we call 
the resolution parameter, we regard such a pair of intersection points as a degeneracy. (A "potential 
degeneracy" may be a more appropriate term, but for brevity we will refer to such a situation as 
degeneracy.) Our new scheme guarantees that for a given parameter e > 0, all the features of 
the spherical arrangement are at least e apart (a formal definition of the e-separation is given in 



1 Throughout the paper we use the term little circle to mean any circle on a sphere S that is the intersection of S 
with another sphere. 




Figure 1: The sphere model of a molecule. The arrow points to the sphere whose spherical arrange- 
ment is drawn in Figure 2 (a). 

Section 4). The resolution parameter e depends on the floating point precision and on the type 
of operations (e.g., computing the intersection points of three spheres). We assume here that e is 
given. There are numerical analysis methods to compute useful bounds on e; see [23, Chapter 4] for 
examples concerning linear objects. We point out that in our algorithms the 'depth of operations', 
namely how many times in a row the result of one operation is the operand in another operation, is 
bounded by a small constant. Therefore one can obtain a bound on the resolution parameter that 
does not depend on the input size n — the number of spheres, in our case. 

1.2 Summary of Results 

We present an efficient perturbation scheme for a collection M of n spheres in M 3 that makes our 
geometric algorithms robust. We call the decomposition of H 3 induced by the spheres the arrange- 
ment of the spheres; the subdivision of each sphere by the circles of intersection with other spheres 
we call a spherical arrangement. We denote the maximum number of spheres in M intersecting 
any single sphere in M by k (as was shown in [21], k is a constant for the hard sphere model of 
molecules; see also Section 2.2). For any given resolution value e > 0, we determine a parameter 
S that depends on e, on k, and on the maximum radius R of a sphere in M. We then present a 
scheme that perturbs each sphere by at most <5, resolves all the degeneracies in the arrangement of 
the spheres, and runs in 0(n) time. 

We also take care of degeneracies that result from a further decomposition (or refinement) of 
the spherical arrangements known as the trapezoidal decomposition (see Section 2.1). Since in the 
trapezoidal decomposition we are free to choose a direction for the 'poles' (two antipodal points on 
a sphere, such that all the arcs added in the refinement are portions of great circles through these 
poles) , we choose the poles so that the angular separation of the added arcs will be above a certain 
threshold lj. Here also, we choose u such that determining the poles could be done fast. 



We have implemented this perturbation scheme and we report experimental results below. For 
example, letting e = 10 -11 , S = 10~ 9 , and to = arccos(l — 10 -11 ), loading a molecule with 2034 
atoms (namely computing the boundary surface and all the individual spherical arrangements) takes 
146.8 seconds, including the perturbation time, on an SGI Indigo 250Mhz RS4400. Our experimental 
results show that in most cases the time taken up by the perturbation scheme is negligible (Section 6) . 

Our software is part of a toolbox of data structures and algorithms aimed to support the chemist 
in the rational drug design process [2] , and it is currently being used by chemists in a pharmaceutical 
company. Further details on the larger software (the toolbox) can be found in [12]. 

1.3 Comparison with Related Work 

As mentioned above our approach to robustness can be categorized as fixed precision approxi- 
mation. The approximation is achieved by a controlled perturbation that removes all degeneracies. 
Our approach requires a detailed analysis of all degenerate cases that can arise in the arrangement 
and its refinement. In that sense it shares a feature with the work of Burnikel et al.[5] that solicits 
the direct handling of degeneracies. Of course the big difference between our approach and theirs is 
that they use exact arithmetic. To explain the advantages and disadvantages of both approaches, 
let's take a look at what most algorithms that compute arrangements do. 

A typical algorithm for computing arrangements consists of two intertwined parts: (i) computing 
features, usually vertices of the arrangements and special points such as points of vertical tangency, 
and (ii) computing adjacencies between features to create a 'map' of the subdivision that can 
be traversed, cell by adjacent cell. The first step is usually ignored in the computational geometry 
literature (unless some sophisticated data structures are necessary to identify the features) and when 
it is non-trivial, the difficulty lies in the algebra. The second step is more challenging, especially 
when degeneracies need to be taken into account. Burnikel et al. [5] handle degeneracies also at the 
second stage. Indeed they achieve a clean solution for a 2D problem. We work with degeneracies 
only at the first stage, which is much simpler since we only work with features rather than with 
adjacencies. When our algorithm gets to the second stage it is guaranteed that all degeneracies have 
been removed and this makes programming far simpler. We believe that extending their approach 
to three and higher dimensions will be a difficult task, because of the large number of special cases 
that needs to be handled; this has motivated the body of research on perturbations schemes for a 
long time [10], [11], [30], [35]. It would be interesting to compare the methods for three-dimensional 
arrangements — the setting of our work. 

The obvious disadvantage of our approach relative to theirs and to any scheme that uses exact 
arithmetic is that we approximate the input, which may be unacceptable in certain applications. 
However, for many applications a slight approximation is permissible. A large number of engineering 
and physical world models are approximate to start with, and a controlled perturbation of the 
type proposed here is within the error bounds of the measurement/model. Moreover, there are 
application domains where workers prefer approximations of geometric objects to precise objects, 
because running time, for example, is more important than precision. In such cases, our scheme, if 
applied, can be viewed as part of the approximation. 

Another approach in the framework of fixed precision approximation is snap rounding [22] . This 
approach has been recently revisited [17] and has been successfully implemented for two-dimensional 
arrangements of segments. The disadvantage of this approach relative to ours is that not only 
snap-rounding does not resolve degeneracies, it in fact creates degeneracies. In two-dimensional 
arrangements this hardly has any effect on the difficulty of programming, but we believe that in 
three-dimensional space (e.g., arrangements of triangles in 3-space), again because of the variety of 
degenerate cases, it would be much more difficult to define and implement. 

A common approach to fixed precision robustness, in the folklore of practitioners of geometric 
computing, is as follows: apply a small random perturbation to all the input objects, then run the 



algorithm, and if a problem 2 occurs, start again by applying a random perturbation to the objects. 
Our scheme is different in several ways. Our scheme is guaranteed to succeed (provided that the 
right parameters are chosen) and it is guaranteed to work efficiently. The efficiency comes in part 
from the fact that we work incrementally, and once we have finished inserting an object to the 
arrangement (possibly after several attempts of inserting it in different places) we will not move this 
object again. The price that we pay is in the detailed analysis of degeneracies, and programming 
the tests to check if they arise. 

If degeneracies are not removed, either exact arithmetic must be used to determine precisely 
when an exact degeneracy occurs and the geometric algorithm must take such cases into account, or 
the algorithm will fail. The nature of the failure depends on the algorithm. In the case considered 
here, during the construction of the surface patches the connectivity of the patches will be self 
conflicting (i.e., the graph of patch connectivity will no longer correspond to a valid topology). This 
results either in the algorithm entering an infinite loop (since the assumptions on which it was based 
have been violated) or in surface area calculations which make no physical sense (e.g., some spheres 
are attributed with negative surface areas or with a contribution to the surface greater than the 
total surface area of the sphere) . 

Ours is definitely not the first implementation of an algorithm for computing molecular surfaces; 
see, e.g., [7], [9], [29], [34] and the recent survey by Connolly [8]. We believe that our approach stands 
out in its efficient treatment of robustness issues, and in the ease and flexibility of computing 
substructures of the collection of the spheres, such as the union of a subset of the spheres of one 
molecule consumed by another molecule. 

1.4 Paper Outline 

The rest of the paper is organized as follows. In the next section we give more background 
details on spherical arrangements and on the hard sphere model of a molecule. In Section 3 we 
review our software package. The key ideas underlying our new perturbation scheme are presented 
in Section 4. Algorithmic details of the scheme and running time analysis are given in Section 5. In 
Section 6 we report experimental results. More practical issues concerning our implementation are 
discussed in Section 7. A brief summary and suggestions for future research on the topic are given 
in Section 8. In the Appendix we complete technical details concerning shapes and volumes that 
arise in the perturbation scheme. 

2 Preliminaries: Spheres and Molecules 

2.1 Little Circles on a Sphere 

A collection of circles on a sphere S induces a partitioning of the sphere into vertices, edges and 
faces. We call such a partitioning a spherical arrangement; see Figure 2(a). If all the circles are great 
circles, then one can transform the spherical arrangement into a planar arrangement of lines [20], 
which is a simpler object to handle [16]. However, in our application the circles are not necessarily 
great circles. Each of the circles is the result of the intersection of S with another sphere. We refer 
to these more general circles as little circles. 

The faces in the spherical arrangement need not be simply connected, and each face may have 
a large number of edges on its boundary. We can apply a standard refinement procedure, called a 
trapezoidal decomposition that will make each face homeomorphic to a disc and have at most four 
edges on its boundary, as illustrated in Figure 2(b); see, e.g., [20, Section 21.3] for more details on 
trapezoidal decompositions. In this context, we fix a pair of antipodal points as poles. We call the 



2 What a problem is depends on the algorithm and the implementation and refers to any unexpected or undesired 
outcome of the program. 
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Figure 2: A spherical arrangement (a), its full trapezoidal decomposition (b), and partial trapezoidal 
decomposition (c). 

great circles through the poles polar circles, arcs of polar circles polar arcs, and any point on a little 
circle that is tangent to a polar circle we call a polar tangency. For every polar tangency of every 
circle in our collection, we extend a polar arc in either direction until it hits another little circle or 
reaches a pole. We do the same from every intersection point of a pair of little circles. We call this 
refinement the full trapezoidal decomposition. 

If we are only concerned with making each face simply connected and making the graph of all 
the edges of the arrangement connected then using a partial trapezoidal decomposition, in which 
polar arcs are only extended from polar tangency points and not from intersections, suffices. See 
Figure 2(c). 



2.2 The Hard Sphere Model 

A common approach to representing the three-dimensional geometric structure of a molecule is 
to represent each of its atoms by a "hard" sphere. In certain applications it is also assumed that 
the relative displacement of the spheres is fixed. There are recommended values for the radius of 
each atom sphere and for the distance between the centers of every pair of spheres. In this model, 
the spheres are allowed to interpenetrate one another, therefore it is sometimes referred to as the 
"fused spheres" model (see Figure 1). The envelope surface of the fused spheres may be regarded as 
a formal molecular surface. It is evident that various properties of molecules are disregarded in this 
simple model. However, in spite of its approximate nature, it has proven useful in many practical 
applications. For more background material and references, see for example the survey paper by 
Mezey [26]. 

In [21] the hard sphere model is studied from a computational geometry point of view. Several 
observations are made in that paper showing that, because of certain special properties, the spheres 
in this model can be efficiently manipulated. We cite below the results that will be needed in later 
sections. Theorem 2.1 states the conditions that make the sphere model of a molecule favorable. 
Theorem 2.2 summarizes a hash-table based data structure that is constructed exploiting these 
conditions. 

Theorem 2.1 [21] Let M = {B\, . . . ,B n } be a collection ofn balls in 3-space with radii n, . . . ,r„ 
and centers at c\ , . . . , c n . Let r = min, r, and let R = maxj r^. Also let S = {Si ,... ,S n } be the 
collection of spheres such that Si is the boundary surface of Bi . If there are positive constants p\,pi 
such that y < Pi an d for each Bi the ball with radius pi -r, and concentric with Bi does not contain 




Figure 3: Screen shot of interactive application showing the molecule conocurvone. A carbon atom, 
the checked one on the lower right of the viewing window, has been selected and the partial vertical 
decomposition for it is shown in the upper left window. The viewing window displays the molecule 
with each atom colored by element type and shaded by area contributed to the surface. 

the center of any other ball in M (besides c,-), then: (i) for each Bi £ M, the maximum number 
of balls in M that intersect it is bounded by a constant, and (ii) the maximum combinatorial 
complexity of the boundary of the union of the balls in M is 0(n). 

Theorem 2.2 [21] Given a collection M of n balls as defined in Theorem 2.1, one can construct a 
data structure using 0(n) space, to answer intersection queries for balls whose radii are not greater 
than R, the maximum radius of the balls of M, in 0(1) time. The expected preprocessing time of 
the structure is 0(n). 



3 Overview of the Software Package 

The algorithms described in the paper are embedded in a couple of different applications for 
computational chemists. One of these applications is a batch processor to take a single molecule 
or pairs of molecules and compute and output surface statistics. The other application is a real- 
time viewer with which the user can manipulate a three-dimensional model of the molecule (or pair 
of molecules) and display various statistics through either the shading and color of the model at 
different atoms, or through other three-dimensional models of individual atoms. A screen shot of 
this second application can be seen in Figure 3. 

Among the surface calculations interesting to chemists are the surface area, the void areas (the 
area of the boundry of hidden "void pockets" in the molecule or between two molecules) , and the 
interaction of surface areas. All of these statistics are useful on a per atom basis. The last statistic 
(surface interaction) is of special interest. For this, the user would like a measure of what portion 
of one molecule is "eaten" by (or part of the intersection of) another. A measure of the intersection 
of the two molecules, broken down by atom, helps the user to understand which atoms of the drug 
and protein are involved in the binding of the drug molecule. It is important to understand which 



features of the drug molecule are necessary and which are changeable during the course of novel 
drug design. 

There are many different measures of such a consumption. For our program, we chose to report 
the difference of the surface area of the two molecules together (and intersecting) and the area 
of the molecules separately. Since we explicitly decompose each sphere by all the little circles of 
intersection, we can be very flexible in our surface computations. 

In order to construct the surface, a seed patch (or a surface region guaranteed to be on the sur- 
face) is found and the surface is grown by following the connectivity of the spherical decompositions. 
In our search, when crossing an edge created by a little circle, we have the choice of continuing on 
the current sphere or jumping to the intersecting sphere. To construct the entire surface, we always 
choose to change spheres. However, we can remove any arbitrary set of spheres from an already 
constructed decomposition simply by choosing to stay on the current sphere when crossing any edge 
made by an intersection with a sphere in the omitted set. 

In this manner, the decomposition of spheres can be computed once and then reused for any 
surface calculation needed including: omitting certain atoms (like all hydrogens), considering only 
certain types of atoms, or omitting whole molecules. In addition, surface calculations can attribute 
to each atom its portion of the surface area. 

This paper will concentrate on the robust computation of the spherical decompositions. Higher 
level data structures for keeping track of atoms within multiple molecules and lower level code for 
computing intersections and storing the geometry of points and arc segments will be assumed for 
our discussion. 

For representing the spherical decompositions, we chose a variant of the quad-edge data structure 
[19]. This structure makes updating the subdivision simple. Each arc is stored as a plane and two 
points (thus the arc is the intersection of the plane and the sphere between the two points) . Four 
operations are needed: adding an arc, removing an arc, splitting an arc, and merging two connected 
arcs (namely two arcs that belong to and are adjacent along a single circle). The maintenance of 
this data structure is fairly standard and we omit the details here. 

The data structure representing the spherical arrangement assumes that the graph of arcs is 
connected. This means that all of the regions must be simply connected. To ensure this, we break 
up the surface with additional arcs in order to create a trapezoidal decomposition, as described in 
Section 2.1. 

We have presented so far the ingredients that are necessary to understanding the perturbation 
scheme. Further details on our software implementation are given below in Section 7. 

4 The Perturbation Scheme 

We distinguish two types of degeneracies that arise in a collection M of intersecting spheres and 
atom maps as we compute them. Recall that we are concerned with floating point arithmetic and 
that we define a degeneracy using a resolution parameter e > 0. In this section we give a precise 
definition of the various degeneracies. We describe the two types as they occur on a single sphere. 
See Figure 4. 

Type I A little circle is too small, two little circles are tangent (or almost tangent), or two intersec- 
tion points each being the intersection of three spheres are close together. Had we been using 
exact arithmetic these (potential) degeneracies would correspond respectively to the following 
exact degeneracies in the three-dimensional arrangement of spheres: Two spheres are tangent 
to one another, three spheres intersect in a single point, or four spheres intersect in a point. 

Type II The angle between the planes containing two distinct polar arcs is too small (it is below 
some threshold to). 




type I type I 

Figure 4: Examples of the two types of degeneracies. Type I degeneracies are marked by small 
shaded circles. 

Type I is inherent to the arrangement of spheres, whereas type I is an artifact of our decomposition 
method. 

We wish to perturb the spheres slightly so that all the features of the arrangement will be at 
least e apart, for a given resolution parameter e; an exact and formal definition of "features being 
e-apart" is given in the following section. We would like the perturbation procedure to be efficient 
and at the same time that the perturbation magnitude will be as small as possible. For a given 
e our procedure will determine a perturbation value S that will guarantee that the procedure will 
take 0(n) time for n spheres. 

We use a two-step perturbation scheme: 

Step 1. We remove type I degeneracies by an incremental procedure where we add the spheres 
one-by-one and if a degeneracy occurs we only perturb the last sphere that has been added. 

Step 2. We choose the orientation of the pole (namely the direction to which all of the planes 
containing polar arcs will be parallel) that will eliminate degeneracies of type I. 

4.1 Removing Type I Degeneracies 

Let Si, £2, ■ ■ ■ , £„ be an ordering of the spheres in M. For a pair of spheres 5, and Sj let pij 
denote the distance between their centers. Let M t denote the set U t i=1 {Si}. After the completion 
of stage t, the incremental procedure will maintain the following invariants: 

I\ The center of any sphere in M has been moved by at most S from its original placement (6 is a 
constant to be determined below). 

I2 For every pair of distinct spheres Si and Sj, i,j < t, and r,- > r,, we have \pij — r, — rj\ > e and 
\Pij +ri-rj\ > e. 



^3 For every triple of distinct spheres Si,Sj and Sk, i,j,k < t, the circle Cy := 5, (~l Sj and the 
sphere Sk are not tangent, and are at least e away from being tangent. (The formal definition 
of this invariant is given in the Appendix.) 

I\ Let Ti and Tj each be an intersection point of three spheres in M t ; then d(ji, Tj) > e. 

Invariants I 2 through 7 4 correspond to insuring that the degeneracies of type I are avoided by a 
margin of at least e. We call a perturbation scheme that satisfies the above invariants at the end of 
each stage (and in particular after the nth stage) a valid perturbation scheme. 

Suppose that the procedure has been carried out successfully for the first t stages. We next 
describe how we add the sphere S t +i so that M t +i will maintain the invariants above. We denote 
the center of sphere 5, for i < t after the completion of step t by C[. Let B(C t +i,6) be the ball 
of radius S around the original placement of the center Ct+i of St+i ■ We will place the center of 
St+i inside B(Ct+i,S) and this will guarantee the invariant I\. The invariants I 2 ,I 3 and I\ define 
forbidden loci F 2 ,F 3 and F4 respectively for the center of St+i- We will choose the new placement 
C' t+1 of the center of S t +i to be in G t +\ ■= B(C t +i,6) \ (F 2 U F 3 U F 4 ). If such a placement exists, 
we call it a valid placement of the center of St+i . If the original placement of St+i lies in Gt+\ 
then we do not move St+i — this will guarantee that if the features of the arrangement of the input 
spheres are already e-apart, no perturbation will take place. 

The region F2 is the union of spherical shells 3 of two types. Each sphere Si € M t induces two 
spherical shells according to whether the tangency is external (namely, at the tangency point the 
interior of the spheres are disjoint) or internal. One shell has its center in C\ and its radii are 
r, + r t — s and r, + r t + s. This spherical shell represents the loci of placement of the center of S t 
that result in the spheres Si and St being tangent, or almost tangent. The other type of shells is 
defined similarly. 

We postpone a detailed explanation of the shape of the objects contributing to F 2 ,F 3 , and F 4 to 
the Appendix. We just mention that F 4 is also the union of spherical shells, whereas F 3 is the union 
of toric shells. In the Appendix we derive upper bounds on the volume taken up by the regions 
F 2 , F 3 , and F A . Let VF := Volume(F 2 U F 3 U F A ) and let VB := Volume( J B(C t+ i , 5)) . 

Our goal is to choose S such that VB > 2 • VF. If we do that, we are guaranteed that a point 
chosen uniformly at random inside B(Ct+i,6) has probability > \ to be a valid placement for the 
center of S t +i- Our calculations, which are given in the Appendix, show that we need to choose 
6 = f(k,e,R) = 2ke 1 / 3 R 2 / 3 , where k is the maximum number of spheres in M intersecting any single 
sphere in M, R is the maximum radius of a sphere in M, and assuming k > 10. Not surprisingly, 
our experimental results show that this is a very conservative bound. The theoretical bound is a 
crude worst-case bound, and even as such it shows that our approach does not conceal any very 
large constant. 

4.2 Removing Type II Degeneracies 

After we have removed all type I degeneracies, we wish to choose a direction for the poles such 
that no type I degeneracy arises, when using the partial vertical decomposition. (The method can 
be extended to deal with full vertical decomposition, and we report below experimental results for 
the full decomposition as well.) We will handle each sphere separately, namely choose the direction 
for the poles independently for each sphere. 

The situation here is significantly different from the situation with type I degeneracies: It is 
impossible to guarantee an e-separation of the polar arcs. Polar arcs can get arbitrarily close to 
one another, and they can coincide in the poles. Therefore our goal is to obtain a good angular 
separation, which we call u -separation, and is defined as follows: Once the direction of the poles has 



3 A spherical shell is the region enclosed between two concentric spheres. 




Figure 5: A cross-section of Cu and II. 



been determined, for any pair of distinct polar arcs the angle between the planes containing them 
is at least to. We wish to determine the biggest to that will allow us to run Step 2 efficiently. If 
w-separation has been obtained for an u value which is greater than the floating point resolution, 
polar arcs can be safely and consistently distinguished during computation. 

We next describe this procedure for a single sphere S t , after Step 1 has been completed (and 
type I degeneracies have been removed). Fix two little circles Cu and Cjt on St, each being the 
intersection of St with another sphere in M (Si and Sj respectively), and let $ be a great circle on 
St that is tangent to Cu and Cjt- 

In general, $ is one of at most four great circles tangent to these two little circles: There is a 
one-to-one mapping from great circles to planes passing through the center of the sphere (a great 
circle is the intersection of such a plane with the sphere). Let II be a plane passing through the 
center of the sphere which we will parameterize by its normal n w and, by an abuse of notation, we 
will also let n„- denote the point on the unit sphere such that the normal n w is the difference of that 
point and the origin. If we restrict II to pass tangent to the circle Cu, then n w must lie on a little 
circle on the unit sphere. 

This can be shown by constructing the sphere Gu whose center, g it , lies on the lie between c, 
and c t such that for any point p on the circle Cu, (p — git) is parallel to n w (see Figure 5). If we let 
r„ be the vector from g it to p, we note that the set of all possible r„ forms a little circle (namely 
Cu) on the sphere Gu- Thus, since n^ is parallel to r w , we know that the normals to II must lie on 
one of two little circles on the unit sphere (one corresponding to Cu and one corresponding to the 
reflection of Cu about gu since if n is a normal of a plane, — n is also a normal of the same plane). 

Thus, if we restrict II to be tangent to two little circles (Cu and Cjt), we are restricting n w 
to be at the intersection of two sets of two little circles on the unit sphere. This yields up to 8 
solutions provided that the sets of little circles are not identical. However, since each solution will 
be produced along with its negative (which will correspond to the same plane and thus the same 
great circle), we conclude that unless Cu and Cjt are parallel and of the same radius, there will be 
at most four great circles of S t which are tangent to both Cu and Cjt ■ 

The circle $ is the locus of pole locations that will cause the planes containing the polar arcs 
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n' n it" 



Figure 6: A cross-section of S t - 

extending from the tangency points of $ with Cu and Cjt to coincide, namely have zero angular 
separation. For the moment, assume that there are only 4 possible positions for $. Let II' and 
II" be two planes parallel to II, each on a different side of II, and such that any plane n passing 
through the center of St and tangent to the circle II' n St (or the circle II" n St) makes an angle 
u with II. See Figure 6 for an illustration. We call the portion of St between II' n St and II" n St 
the co-strip of $, and denote it by a(co, $). It is easily verified that if the poles are chosen outside 
er(w, $) then the polar arcs that correspond to the tangencies of $ with Cu and Cj t are at least w 
apart. 

Every tangent circle $ of every pair of circles on St defines an w-strip a(tu, $). Assume without 
loss of generality that the radius of St is 1. The area of any such strip is 47rsinw. The maximum 
number of strips that need to be considered on St is 4(*). For Step 2 to run in O(n) time we wish 
that at least half of the surface area of S t will not be covered by w-strips (any constant fraction will 
do; the smaller the uncovered fraction the higher the constant factor in the expected running time 
bound). Therefore we choose to such that 



2 • 4 ( ) 47r sin w < 2 • 4tt 



sinw < l/(2fc(fc-l)) 



We now consider the case where Cu and Cjt are parallel and have equal radii, thus producing 
an infinite set of possible $. In this case, we are guaranteed that Cu and Cjt will lie on opposite 
sides of c t (if they did not, it would violate the assumptions of Theorem 2.1). For any $, when we 
choose a pair of poles on $, we split $ into two equal halves. In this degenerate case, for any choice 
of these poles, the two tangency points will be on different halves. Polar arcs can run, at maximum, 
from pole to pole (only one half of a great circle) and thus can be distinguished from polar arcs 
which run on the "other half" of the same great circle. Each tangency will induce at maximum a 
polar arc that will run on one half of $, but both will be on opposite sides of the poles thus can 
can be differentiated. Hence, we only need to be sure that the pole does not lie on or too close to 
the circles Cu or Cjt (which would cause another degeneracy). To do this, we add an additional 
constraint that the pole direction may not be within e (the same e as from the removal of type I 
degeneracies since we are attempting to distinguish between points) of any little circle. 
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Remarks. 

(1) The w-strip is only an inscribing region of the forbidden region corresponding to any $. The 
actual forbidden region induced by any tangency is in fact smaller, and it has a more complicated 
boundary. 

(2) Extending the method to the full vertical decomposition requires handling several more cases 
involving up to four circles C, t each. This extension is straightforward and we omit further details 
here. We have implemented this extension and we report experimental results for it below. 

(3) In practice, it is convenient to impose the same pole direction for all the spheres in the arrange- 
ment, for ease of coding, debugging and visualization. To this end we draw all the w-strips on the 
unit sphere of pole directions S 2 , and the constraint on u becomes sin a; < l/(4nk(k — 1)). The 
experimental results described below are for this choice of pole directions. 

(4) The construction of the forbidden regions on S t is diametrically symmetric, namely a point on 
St is free (i.e., represents a pole that will induce an w-separation) if and only if its antipodal point 
is free. 

5 Algorithmic Details and Complexity Analysis 

In this section we describe how the perturbation scheme is carried out efficiently. 

The incremental procedure of Step 1 will move the center of each sphere by at most 8 from 
its original placement. We expand each of the original spheres St into a sphere S" whose radius 
is r t + S, and let M" be the set of the expanded spheres S' t ' . We construct a data structure as 
described in Theorem 2.2 to support range queries on the spheres in M" . This structure enables 
us to find in 0(1) time all the spheres in M" that intersect a given query sphere S" . Since we use 
expanded spheres, the structure can be used for detecting intersections with original as well as with 
perturbed spheres. 

When looking for a perturbation of the center of St during Step 1 of the procedure we proceed 
as follows. Although constructing the subdivision of B(Ct,S) into free and forbidden regions could 
be done in maximum constant time, it would be an extremely difficult task that may introduce 
new precision problems. Instead we do something much simpler. We choose a point p uniformly at 
random in the ball B(Ct,S). We next check whether p is a valid perturbation by checking that it lies 
outside all the forbidden regions F2, -F3 and F4. For example, to check whether p lies outside F2 we 
check whether it lies outside each of the at most 2k spherical shells defining F2 . Each of these tests 
is simple and fast and their overall number is bounded by a constant (see below for experimental 
results) . Recall that S was chosen such that with probability > \ , the point p will represent a valid 
perturbation. Thus the expected number of trials before we find a valid p is < 2, and the overall 
running time of Step 1 is 0(n). 

Similar arguments apply to Step 2 procedure. For any given sphere St, we choose a point p 
uniformly at random on the boundary of S 2 . We then test all the w-strips relevant to St to check 
whether p lies outside all those strips. By the choice of to the expected number of trials before we 
find a valid p is < 2, and the overall running time of Step 2 is 0(n) as well. 

We summarize the performance of the perturbation scheme in the following theorem. 

Theorem 5.1 Given a collection M of n spheres as described in Theorem 2.1, and a resolution 
parameter s > 0, a valid perturbation of the spheres in M can be computed in expected 0(n) 
time, by moving each sphere by at most S from its original placement, where S is a parameter that 
depends on e, on the maximum number of spheres in M intersecting any single sphere in M, and on 
the maximum radius of a sphere in M, and such that all the degeneracies are resolved. In expected 
0(n) time we can also find a direction for the poles on each sphere so that all the polar arcs in the 
trapezoidal decomposition of the spherical arrangements lie on planes the angle between any pair of 
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number 


file name 


size 


k 


mean k' 


1 


estradiol. mol2 


44 


17 


6.81 


2 


clofazimine. mol2 


55 


13 


6.67 


3 


michellamine_b.mol2 


104 


13 


6.81 


4 


288d.pdb 


120 


9 


6.25 


5 


conocurvone.mol2 


127 


14 


6.67 


6 


245d.pdb 


240 


9 


6.17 


7 


lppt.pdb 


301 


10 


5.92 


8 


4pti.pdb 


454 


10 


5.79 


9 


lbzm.pdb 


2034 


10 


5.74 


10 


2pka.pdb 


3598 


11 


5.79 


11 


2ace.pdb 


4143 


12 


6.05 


12 


lsdk.pdb 


4384 


12 


6.00 


13 


lnok.pdb 


6759 


11 


5.73 


14 


7atl.pdb 


7106 


12 


5.70 



Table 1: Listing of the molecules used for testing. Molecules with file names ending in pdb can be 
found at the Protein Databank at http://www.pdb.bnl.gov/. Molecules with file names ending in 
mol2 can be found at the Center for Molecular Modeling at http://cmm.info.nih.gov/modeling/. 
The size column gives the number of atoms in the molecule, k! and k are explained in the beginning 
of Section 6 

which is at least to, for to that depends on the maximum number of spheres in M intersecting any 
single sphere in M. 

We emphasize again that the theoretical bounds that we obtain on S and to are crude and in 
practice (as shown next) these quantities are much smaller. 



6 Experimental Results 

In order to ease the explanation of the results, we introduce a new variable. Let k! be the number 
of spheres intersecting a given sphere in a given collection of spheres. Thus, k' is a function of a 
particular collection and a particular sphere and the constant k in the previous sections is then 
max A;' over a collection of spheres. These values are listed in Table 1 for the fourteen molecules we 
chose for the timing experiments. The molecules from the Protein Databank did not specify the 
positions of hydrogen atoms whereas the others did. This accounts for most of the difference in k 
and k' values. All timings were done on an SGI Indigo with a 250Mhz MIPS RS4400 CPU. 

For our timings, we ran the program on each of the molecules in our set for varying e and 
5: S on [10~ 6 ,10 -10 ] and e on [10 _9 ,10~ 12 ] to determine the experimental dependence of S on e. 
Figure 7 shows the results of these timings for the portion of the code not involved with computing 
perturbations. Figure 8 details the time spent in the perturbation code. 

It should be noted that almost all of the perturbation time (90 — 98%) is spent finding the 
pole direction. Table 2 shows the fraction of the ^-spheres which are free, (VB — VF)/VB in the 
notation of Section 4.1, and Table 3 shows the fraction of the pole directions that are free for a 
typical molecule. To obtain the former chart, we modified the program to attempt (and discard) 
1000 perturbations each time the center of an atom was positioned. The values shown in the table 
are the average over all center placements of the fraction of these positions that were valid. The 
latter was obtained with a similar sampling method for pole directions. 

The reason for the dramatic difference between the time spent on pole perturbation and the 



13 



CD 

b 




5000 10000 15000 20000 25000 30000 35000 40000 45000 

m x mean-k' 



Figure 7: Time to build the molecules shown in Table 1 excluding perturbation time. The horizontal 
coordinate is size-of-molecule x mean k 1 . 

Each diamond corresponds to he time to build the surface for one molecule for one particular setting 
of S and e as described in Section 6. 
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Figure 8: Ratio of perturbation time to total time versus ratio of S to e. 
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Table 2: Valid fraction of the <5-sphere for molecule 10 from Table 1. e varies across the table and 
S varies downwards. 
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Table 3: Valid fraction of pole positions for molecule 8 from Table 1. 
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Table 4: Valid fraction of the ^-sphere for the degenerate set of spheres described in Section 6. e 
varies across the table and S varies downwards. 

time spent on center perturbation can be attributed to a few additional constraints we placed on 
the poles. In order to simplify certain coding and debugging procedures, we insisted that all spheres 
share the same pole direction. In addition, the poles were required to be chosen such that no 
intersection point of a polar arc and a little circle would be within e of any other vertex of the 
arrangement, and that no polar arc be within to of the special great circle that is added as part of 
our "simplified point location" procedure (which we describe in the next section). 

We also performed an experiment similar to that of Table 2 for a purposefully degenerate col- 
lection of spheres and it is summarized in Table 4. For this set, we arranged 27 spheres in a cube 
on a unit regular grid. Each sphere had a radius of 1.05 thus producing 2 to 12 degeneracies where 
3 little circles intersect at one point on each sphere. In this highly degenerate case, most of the 
(5-sphere was free even when the ratio of S to e was as small as 10. 



15 



7 Additional Issues in Computing Spherical Arrangements 

7.1 Simplified Point Location 

We construct the spherical arrangement on each sphere by sequentially adding one sphere at 
a time to the collection. Each sphere induces a number of little circles on both itself and other 
spheres. These are added in corresponding pairs. When a new circle is added to a sphere, we must 
locate in the previously constructed arrangement a point on the little circle. For circles which do 
not encompass a pole, we choose to locate the polar tangency points. Since polar arcs must be 
extended from each tangency to the arc segments "above" and "below" the tangency, finding these 
"upper" and "lower" bounds on the polar arc is sufficient for point location. 

In order not to have to search through all of the arcs on a sphere to find the two arcs needed, we 
use a simple partitioning scheme. Each sphere is divided into sections along the equator. All arcs 
on the sphere are projected onto the plane of the equator and then outward to the equator. Each 
section on the equator stores references to the arcs whose projections pass through it. Furthermore, 
since all little circles are broken at polar tangency points, the equatorial extent of any arc segment 
can be found by projecting only its endpoints. 

Thus, for point location, the point is similarly projected onto the equator and then onto a section. 
Half of a great circle passing from one pole to the other through the point is constructed and this 
polar arc is then intersected with the arcs listed in the found section to find the arc immediately 
"above" and "below" the point. For our implementation, we chose to divide the equator into a fixed 
number of equal-size sections. 

For little circles which surround one of the poles, such a scheme will not work since there are no 
such tangency points. To ensure simple point location in this case, one great circle passing through 
the poles is maintained. Thus each little circle without tangencies will intersect this great circle 
twice. These two points are then used as starting locations for adding the rest of the little circle 
just as the tangency points would otherwise be used. 

Figure 9 shows timings of the code involved in point location during a load of molecule 10 from 
table 1. The plot is not as conclusive as we might hope (we would like to find a minimum in the 
graph that corresponds to a good trade-off between query and maintenance time) , but it certainly 
shows an initial negative slope for small numbers of sections and hints at a positive slope for larger 
values. 

7.2 Full vs. Partial Decomposition 

To evaluate the experimental advantage of either the partial or full decompositions, we took 
molecule 4 and timed the algorithm while successively increasing each radius in the radii table 
used to convert element types to radii. The radii were expanded by a constant each time and the 
calculations redone. This provided a number of different molecules with different k' values and a 
somewhat "natural" distribution of spheres in space. 

The average value of k' gives a good indication of the complexity of the spherical decompositions 
constructed since it is the average number of spheres intersecting any given sphere in the arrange- 
ment. Thus (average k 1 ) xm is the total number of little circles computed during the construction 
of the spherical decompositions. 

Figures 10, 11, and 12 detail the amount of time spent in each of the phases (decomposition 
construction, pole perturbation, and center perturbation respectively). The last figure only has one 
graph since the algorithm for finding centers does not change based on the decomposition method. 
The timings were stopped at the point where no further increase could be made in the radii due to 
limited computer time (the algorithms were allowed to run up to six hours) . The pole perturbation 
was found to be the limiting factor. 
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Figure 9: Fraction of the total time of the construction spent in locating points, divided into 
time spent maintaining the data structure and the time spent querying and locating points on the 
decomposition. 
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Figure 10: Time spent constructing the decompositions after the perturbation for full and partial 
decompositions against the mean of k 1 (the average number of little circles per sphere) 
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Figure 11: Time spent finding a valid pole direction for full and partial decompositions against the 
mean of k' (the average number of little circles per sphere) 
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Figure 12: Time spent perturbing the centers of the spheres against the mean of k! (the average 
number of little circles per sphere) 
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8 Conclusion 

We have presented a perturbation scheme for a collection of spheres in three-dimensional space. 
Our scheme is suitable for computing with finite precision arithmetic and we have presented ex- 
perimental results obtained while using standard floating point arithmetic. For a given resolution 
parameter e > we perturb the spheres such that features of the three-dimensional arrangement of 
the spheres (and hence of the two-dimensional spherical arrangement on each sphere) are at least e 
apart. Our scheme balances between the size of the perturbation, which we aim to minimize, and 
the expected running time of the scheme: The smaller the magnitude of the perturbation the longer 
the expected time it may take to compute a valid perturbation. 

The scheme that we have presented is fairly easy to program, it removes degeneracies and in 
that makes the other parts of the algorithm easier to program, and as the experimental results show 
it runs efficiently. 

Our motivation to develop this scheme is a software package that we have devised aimed to 
support geometric queries on molecular models. The new scheme has made our algorithms and data 
structures robust with only little effect on the running and reaction time of the system. We have 
also presented experimental results showing this small effect. Our software package is currently used 
by chemists working in rational drug design. 

The main direction for further research that we propose is to extend the scheme to other types 
of arrangements of geometric objects. An obvious limitation of our approach, that may make 
it unsuitable for certain applications, is that we actually move the input geometric objects from 
their given placement. However, we believe that there are applications where a small bounded 
perturbation of the input objects is permissible, since often the precision of the input objects is 
limited to start with (due to measurement limitations, for example). 
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Appendix: Shells of Degenerate Placements 

In this appendix we give more details on the shape and volume of the shells defining the regions 
F2,F 3 and F 4 as mentioned in Section 4; we use the notation established there. 

The region F2 consists of placements of the center of S t +i that induce tangency or near-tangency of 
St+i and another sphere. There are two types of tangencies: external and internal. We first describe 
the shell for external tangency. For a sphere Si, an exact tangency is induced by placing the center 
Ct+\ of St+\ at distance exactly r, +rt+\ away from the center of Si, namely, Pi(t+i) — i"i — rt+\ = 0. 
We define the potential degeneracy of this type when using floating point with resolution parameter 
e > as the union of placements of the center of S t +i such that — e < Pi(t+\) — r i — r t+i < £, which 
is a spherical shell (i.e., the region sandwiched between two concentric spheres) with center at d 
and radii r,+r i+ i —e and rj + r t + i +e. Its volume is §7r[(rj +r t +i +e) 3 — (r, +r t +i — e) 3 ]. Similarly 
the volume of a shell corresponding to internal tangency (assuming r t +\ > r,) is |7r[(r i+ i — r, + 
e) 3 - (r i+ i -n -e) 3 ]. 

The region F3 is the union of toric shells each defined for a pair of distinct spheres Si, Sj € M t . 
Let Cij denote 5, [~l Sj where both spheres are in their final placement, namely, they have possibly 
been perturbed. The toric shell for the pair Si,Sj is the loci of placements of the center of St+i 
that will cause St+i to be tangent, or almost tangent to Cij. Thus it represents the placements of 
St+i that may cause two circles on a spherical arrangement to be tangent or almost tangent. 

We denote the center of circle Cij by c,j and its radius by r,j . At a point p of tangency between 
C^ and St+i there is a plane U(p) that is tangent to S t +i, and such that its intersection with the 
plane containing Cij is a line L(p) tangent to Cij . If we fix the point p on Cij and hence the line L(p) 
there is a pencil of planes through L(p) where each plane defines a possible osculation placement of 
the sphere St+i and the circle Cij at p. The induced loci of forbidden placements for the center of 
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nfi(p) 




Figure 13: A cross-section of C,j and S t +i on the plane fl(p). 



St+i are a circle with center at p, lying on a plane fl(p) orthogonal to L(p) and having radius r t +i- 
We now extend the definition to near-tangency at p and restrict ourselves to the plane fl(p). 

We assume that the center of St+i lies on fl(p) and hence St+i (~l fi(p) is a great circle of St+i- 
We expand a circle C(p, s) of radius e around p. The forbidden placements for the center of St+i 
are now all the placement where St+i (~l fi(p) (~l C(p,e) ^ — the shaded annulus in Figure 13. 

We repeat this for every point p on Cy and obtain a toric shell. In fact it is only toric-like but for 
brevity we call it a toric shell. It would have been a real toric shell (the region sandwiched between 
two tori with the same center, axis of rotation, and major radius) were it not self intersecting as 
it may possibly be here. To bound the volume of the toric shell, we will first look at the solid of 
revolution obtained by rotating an external quarter of the annulus around the line through dj and 
orthogonal to the plane containing Cy. The obtained volume, which we denote by Vi/ 4 , is clearly 
an upper bound on quarter the volume of the toric shell. Let V\/i{r) denote the volume of a quarter 
torus with major radius r,j and minor radius r. Then V1/4 = Vi/i{rt+\ + e) — Vi/i{rt+\ — e). 
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The region F4 is the union of spherical shells each centered at a point of intersection of three 
distinct spheres in M t and having radii r t +i — £ and r t +i +£■ The spherical shell here is the loci of 
placements of the center of St+i that will cause St+i to pass through an intersection point of three 
distinct spheres in M t or very close to this intersection point. 

Let p denote such an intersection point. The loci of placements of C t +i that will cause S t +i to 
go through p constitute the sphere centered at p with radius r t +i ■ It is easily verified that if we 
wish that p will be e away from S t +i then the loci of forbidden placements are the spherical shell 
centered at p with radii r t +\ — e an r t +\ + e. Its volume is |7r[(r t+ i + e) 3 — (r t +i — e) 3 ]. 

Bounding the volume of F2,F3, and F4. Let M be a collection of spheres as defined in 
Theorem 2.1. Also, as above, let R := max™ =1 n, and let k denote the maximum number of spheres 
of M intersecting a single sphere of M. Given a parameter e > 0, we wish to determine the 
parameter S = f(k, e, R) such that the volume of the forbidden region inside B(Ct+i , 6) is less than 
half the volume of B(C t +i,S). Let VB :=Volume(B(C t +i,6)), and let VF := Volume(F 2 UF3UF4). 
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Note that there are at most k external spherical shells defining the region F 2 and at most k 
internal shells, at most (*) toric shells defining the region F 3 , and at most 2(g) spherical shells 
defining the region F4. Therefore we obtain the following bounds on the volume of these regions: 

Volume(F 2 ) < k\ir[{2R + e) 3 - (2R - e) 3 ] + k\ir[{R + e) 3 - (R - s) 3 } 
< ^7rA;(30i? 2 e + 4e 3 ) 

/ 1 \ -1 r> -1 -1 r* 

Volume(F 3 ) < ( „ ) [47r(27r + A)R 2 e + — s 3 ] < -A; 2 [47r(27r + A)R 2 e + —its 3 ] 



Volume(F 4 ) < 2 Q ^tt[(R + s) 3 -(R- s) 3 } < 2 Q ^(6i? 2 e + 2e 3 ) < ^k 3 7r(6R 2 e + 2s 3 ) 



VF < 7r J R 2 e[40A; + (47r + 8)fc 2 + -A; 3 ]+e 3 [. ..] 

< 5.137ri? 2 eA; 3 (assuming k > 10 and e -C k,R) . 
Finally, we choose S such that VB = |tt5 3 > 2VF. Thus we let S := 2ke 1 / 3 R 2 / 3 . 
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